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AN EFFICIENT ALGORITHM FOR VIDEO SUPER-RESOLUTION 
BASED ON A SEQUENTIAL MODEL 

P. HEAS*, A. DREMEAUt, AND C. HERZET* 


Abstract. In this work, we propose a novel procedure for video super-resolution, that is the 
recovery of a sequence of high-resolution images from its low-resolution counterpart. Our approach is 
based on a “sequential” model (z.e., each high-resolution frame is supposed to be a displaced version 
of the preceding one) and considers the use of sparsity-enforcing priors. Both the recovery of the high- 
resolution images and the motion fields relating them is tackled. This leads to a large-dimensional, 
non-convex and non-smooth problem. We propose an algorithmic framework to address the latter. 
Our approach relies on fast gradient evaluation methods and modern optimization techniques for non- 
differentiable/non-convex problems. Unlike some other previous works, we show that there exists 
a provably-convergent method with a complexity linear in the problem dimensions. We assess the 
proposed optimization method on several video benchmarks and emphasize its good performance 
with respect to the state of the art. 
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1. Introduction. Super resolution (SR) aims at reconstructing high-resolution 
(HR) images from distorted low-resolution (LR) observations. This type of methodol¬ 
ogy dates back to the 70’s with the pioneering work of Gerchberg m and Santis&Gori 
m- Since then, super resolution has been applied to a large variety of applicative 
domains, including infrared [29], medical [46], satellite and aerial |42l [51] imaging. 
We refer the reader to [38] for a pretty comprehensive overview of the works dealing 
with SR. 

One can distinguish between different setups in the domain of super resolution. 
“Single-frame” super resolution aims at computing an enhanced version of some HR 
image from the observation of one single LR image, see e.g., [HI [SO] [43] . On the other 
hand, the “multi-frame” paradigm typically focusses on the recovery of one HR image 
by exploiting the observations of several LR frames, see e.g., [^HTII^ 15711551155 ] . 
Finally, the “video” super resolution problem consists in estimating a sequence of 
HR images from the observations of their LR counterparts. We consider the latter 
paradigm in this paper. 

From a conceptual point of view, a simple (but valid) solution to address video 
super-resolution consists in applying single-frame or multi-frame procedures on each 
frame of the HR sequence to recover. This strategy was for example considered in 
[22l [4TJ [24j [26j [ 57 | [ 55 ] [ 55 ] . Nevertheless, this approach may fail in properly exploiting 
the strong temporal correlations existing between the (successive) frames of the HR 
sequence. Hence, procedures specihcally dedicated to accounting for these dependen¬ 
cies have been proposed in the literature, see [5ni[ini[iiiis5ii^[iii2Hiiini[iH]- a 
central element is the “sequential” model linking the frames of the HR sequence. More 
specifically, in most of these methods, the frames of the HR sequence are supposed to 
obey a dynamical model where each HR image is seen as a displaced version (by some 
unknown motion field) of the preceding one (see sectionfor a detailed description). 
This is in contrast with the standard multi-frame model where each LR observation 
is assumed to be a LR displaced version of one given reference frame. 
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The practical exploitation of the “sequential” model nevertheless faces a certain 
number of bottlenecks. The most stringent one is probably the model dimensionality: 
because it accounts for the temporal evolution of each HR frame, the number of 
variables involved in the sequential model may become very large. This makes video 
SR based on sequential models pretty challenging. As a matter of fact, in comparison 
with the huge number of papers dealing with SR, only a few have focussed on this 
particular problem, see [ini 123 mi [551 El 1131 Hi Si EE] ■ In [5i , the authors modeled 
the dependence between the different images of the sequence as a Gaussian process 
and provided an efficient implementation in the Fourier domain. Other contributions 
relied on adaptive-filtering techniques, see Ei HU Ei Ei Hi HI]. In this line of 
thought, most of the contributions cited above considered that the HR sequence is 
ruled by a state-space sequential model and the authors derived estimation procedures 
inspired by the well-known Kalman filter. The standard Kalman updates leading to 
a prohibitive complexity in the context of video SR, Elad and co-authors published a 
series of papers Hi HI Hi in which they proposed updates having a linear complexity 
in the problem dimensions. Their approach is based on some approximations of the 
model and/or Kalman updates (e.^., uniform translational motion [23], noise-free 
evolution model HI], etc). In |Ii|35], the authors considered a local approximation 
of the state-space sequential model by using steering kernel regression on the LR 
observations. 

In this paper, we provide an approximation-free methodological framework ex¬ 
ploiting a sequential model for video SR. We express the unknown HR sequence as 
the solution of a constrained optimization problem and propose an iterative procedure 
to solve the latter. Our method is provably convergent (to a local minimum of the 
problem) and has a tractable complexity per iteration (be., linear in the problem di¬ 
mensions) . The proposed framework encompasses two important ingredients of video 
SR, namely: */ a precise characterization of the motion fields linking the successive 
frames of the sequence; ii) the exploitation of proper priors on the unknowns of the 
problem. These two ingredients lead to additional difficulties (in top of the large di¬ 
mensionality) since they typically introduce non-convex and non-smooth terms in the 
cost function to minimize. We elaborate on these points in the two next paragraphs. 

A precise characterization of the model connecting the different images of the HR 
sequence is crucial for the success of video SR. Typically, videos are characterized 
by non-global motions. This is in contrast with many standard SR models of the 
literature which assume global motions {e.g., translation [T], affine [58] or projection 
m), well-suited to still image reconstruction. The imaging model in video SR thus 
takes a more-involved form and has to be considered with care. In particular, the 
estimation of the motion between two consecutive frames is usually tantamount to 
solving an optical-flow problem |4]. Embedding motion estimation in the SR recon¬ 
struction introduces new difficulties: */ it increases the problem dimensionality since 
two additional unknowns (the displacement in each direction) have to be estimated 
for each pixel of the HR images; ii) it typically introduces non-linearities in the im¬ 
age formation model. These obstacles are particularly prominent in the case of a 
sequential model because of the nested structure of the unknowns dependencies. As 
a consequence, until recently, motion estimation has been overlooked and considered 
as a side problem in many SR contributions involving either muti-frame or sequential 
models (see e.g., [HI HI HHl HIl ESI EH HI HI Hi E31EZ] ) with the exception of e.g., 

[IHiEIlES] ■ 

Interestingly, several authors have emphasized the importance of accurate motion 
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estimation in the video SR process and provided studies of the sensibility of adaptive 
hltering techniques to the latter, see [HU [HI UHl [TH] . In this paper, we show that 
the motion estimation can be included in our video SR problem without significantly 
increasing the computational cost. 

Another important ingredient for the success of video SR is the definition of proper 
priors on (some of) the unknowns of the problem. Indeed, video SR is a naturally 
ill-posed problem: typical setups impose the observation of (at most) one LR image 
per frame of the HR sequence; hence, if the motion between the different frames is 
unknown, it is easy to see that the number of variables which have to be estimated is 
well beyond the number of observations. In order to tackle this difficulty, a well-known 
technique consists in resorting to prior information on the sought quantities. This type 
of approach has been used extensively (but not only) in the context of single-image 
SR, where an HR image has to be reconstructed from one single LR observation. 
First methodologies based on prior information date back to the 70 ’s nzilll]. Since 
then, many types of priors have been studied, including Markov random fields m, 
total variation EZllinilMj, morphological [H] or sparse IM1I13 models, etc. Among 
the most effective models in the literature, many rely on the minimization of some 
non-differentiable functions. It is for example the case of SR techniques based on 
sparse representations where the decomposition coefficients of the sought quantity in 
a redundant dictionary are commonly penalized by an ii norm, see e.g., [25j . Another 
example is total variation where the ii norm is applied to the gradient of the sought 
images/motions, see e.g., [84] . The introduction of non-differentiable functions in the 
SR reconstruction leads to new conundrums since standard optimization techniques 
for smooth problems can no longer be applied. As mentioned previously, we address 
this problem in the paper as well. Hereafter, we mainly focus on problems involving 
an £i norm, although other non-differentiable convex functions could be processed 
using a procedure similar to the one exposed in this paper. 

In summary, in this paper we propose a methodological framework for video SR 
based on a sequential model. We consider the estimation of both a sequence of HR 
images and the motion fields relating them, while allowing for some non-differential 
terms in the cost function. Our approach is based on the combination of several 
modern optimization tools: fast gradient computation [H], the “Alternating Direction 
Method of Multipliers” [7] for large-scale non-differentiable convex problems, and 
a recent procedure for non-convex and non-differentiable optimization proposed by 
Attouch et al. in [1[3]. The resulting algorithm is ensured to converge to a local 
minimum of the problem while having a linear complexity per iteration in the problem 
dimensions. We illustrate the good behavior of the proposed method with respect to 
other techniques of the state of the art in several setups. 

The rest of the paper is organized as follows. We introduce the notations used 
throughout the paper in section [^ In section [^ we present the sequential model con¬ 
sidered in our subsequent derivations. In section [^ we express the video SR problem 
as a constrained optimization problem and provide a numerical procedure to solve it. 
The overall procedure is described in subsection |4.3| whereas two important algorith¬ 


mic building blocks are presented in subsections |4 .1 [ and 4.2 The numerical evaluation 


of the proposed method is carried out in section^ for different experimental setups. 


2. Notations. The notational conventions adopted in this paper are as follows. 
Italic lowercase indicates a scalar quantity, as in a; boldface lowercase (resp. upper¬ 
case) indicates a vector (resp. matrix) quantity, as in a (resp. A). The n-dimensional 
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vector of zeroes and identity matrix will be written as 0„ and I„. The i-th element 
of vector a is denoted a(i); similarly A{i,j) is the element of A located at row i and 
column j. The exponent * denotes the transpose operation. A subscript notation, as 
in at, will refer to the member of some sequence {at}^g = {ag, ai, • • • , ar}. 

Calligraphic letters, as H, denote functions. The subscript notation Hi may either 
denote the i-th element of a set {Hi}i or the i-th component of a multidimensional 
function H : R™ —>■ R"; the distinction between these two notations is usually clear 
from the context. The Jacobian matrix of H : R™ —>■ R" evaluated at a, denoted by 
Va'H(a), is defined as: 


/i9a(i)'Hi(a) ••• 5a(,„)'Hi(a) 


VaH(a) = 


e R” 


where dy is the partial derivative operator with respect to v. We use the notation 
Va'H*(a) to denote the transpose of VaH(a). 


3. Model. Let xt G R" be the image at time t of an HR video sequence re¬ 
arranged into a n-dimensional vector, with t G {0,... ,T}. Let us suppose that we 
capture noisy and low-resolution (LR) observations yt G R”^ with m < n oi the HR 
sequence: Vf G {0,..., T}, 


yt=n{xt)+Vt, ( 3 - 1 ) 

where r/t G R™ stands for some noise and H : R" —)■ R"* denotes a linear function, 
which is the composition of a low-pass filtering and a sub-sampling operation. We 
focus on the problem of recovering the HR sequence {xtj^g from the LR observations 

{yJ^Lo- 

Without any additional information, this problem is ill-conditioned since the num¬ 
ber of unknowns (that is (T -|- l)n) is larger than the number of observations (that is 
(T -|- l)m). One way to circumvent this problem is to take into account the relation 
existing between the HR images at different time instants. More specifically, as part 
of a video, we can assume that two consecutive images obey the following sequential 

modeQ 


Xt — 7^(xt+i, dt_|_i)-|-Ct-i-i, (3-2) 

where V : R" x R^" —R" is a “warping” function characterized by a displacement 
dt+i G R^", and et+i G R” is some noise. The choice of V is usually motivated 
by some conservation property, as for example the preservation of the pixel intensity 
along the displacement. One particular instance of function V, that we will consider 
in the sequel, is based on the well-known “Displaced Frame Difference” model. More 
specifically, this model assumes that the s-th component of 'P{xt,dt) admits the 
following series representation: 

'P^(xt,dt)= + dt{s)), (3.3) 

*6V(x(s)+dt(s)) 

^ We note that backward sequential models such as are common in the computer-vision 

literature. We therefore restrict our reasoning to the latter formulation. However, adapting the 
methodologies derived in section to a forward sequential model is straightforward. 
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where function returning the spatial position corresponding 

to index s, V(x(s) + dt(s)) denotes a subset of indices corresponding to the “neigh¬ 
borhood” of point x('S) + dt(s) and {V’i}r=i with ijji : R x K, —)• R is a family of 
bi-dimensional polynomial interpolation functions. In this case, (3.2)-(3.3) models 
the fact that xt can be seen as a displaced version of xj+i plus some additive noise. 
Let us note that V, as defined in ( |3.3[ ), is linear in Xj and polynomial in dj; it is thus 
a bi-polynomial function. Let us also mention that V typically only contains a few 
elements that is |V| ^ n, where |V| denotes the cardinality of V; this observation will 
play an important role in the sequel for the analysis of the complexity of the proposed 
SR methodology. 

The noise et+i in (3.21 accounts for all the modifications of the image xj which 
cannot be inferred from Xj+i and dt+i. This includes pixel occlusions, interpolation 
errors or variations of the scene illumination. Notice that, in practice, the choice of V 
should be made such that the residual noise e^+i is as small as possible. In particular, 
if e.t +1 = 0 (and dt_|_i is known), x^ is entirely determined from Xt_|_i Vt. Recovering 
the whole sequence {xt}^Q is then tantamount to recovering the last image xy. In 
such a case, the number of unknowns is therefore reduced to n and the recovery of 
the HR sequence from the LR images may be possible. 

Another option to decrease the ill-possedness of the video SR problem consists in 
restricting the family of signals to which the “initial condition’]^ x^ belongs. We will 
in particulai]^ consider the case where xy is assumed to be sparse in some (possibly 
redundant) dictionary D G R"^'^, that is 


xt = Dc for some c G R'^ such that ||c||o <C n, (3-4) 

where || .||o is the so-called ^‘£o norm”, which returns the number of nonzero coefficients 
of its argument. Dealing with ||.||o leads to combinatorial optimization problems. 
Hereafter we will thus consider the £i norm, which is a well-known surrogate to the Iq 
norm. In particular, if the sparsity of the sought vector is large enough, there exists 
an equivalence between the solution of the problems involving the £o and £i norms, 
see [^ . 

Finally, let us mention that the displacement dj between two successive images is 
rarely known in practice. It must therefore be inferred from the received LR images 
{yt}t=o- This may seem to be counterproductive since the estimation of d* implies 
an increase of the number of unknowns of 2n elements per time step. One way to 
circumvent this problem consists again in constraining dj to belong to a restricted 
family of signals. In this paper, we consider an implicit restriction by enforcing some 
non-negative function of dj to be small. More specifically, we assume that the sought 
displacement is such that 7^(G*dt) is “small”, where G = [gi,..-,g/t] G R^"^^ is 
some linear “analysis” operator and TZ : R^ —)■ R is a non-negative function. We 
note that this approach is commonly adopted in the computer-vision literature in 
which many options for TZ and G have been proposed, see [5]. This approach was also 
used in the “multi-frame” setting EH EH where the motions between the reference 
HR frame and the LR observations were penalized to have a small TV-norm. In the 
sequel, we will focus on the following choice for G and TZ: the elements of G*dt will 
correspond to the spatial gradients of (an interpolation of) d* at each point of the 


^We remind the reader that we consider a backward sequential model. 

®The sparsity constraints could be imposed on every xt without introducing any conceptual 
problems in the methodology exposed in section [4| 
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pixel grid; 7^(G*dt) takes the following form: 

n 

i=i jeSi 


P>0, 


(3.5) 


where w is defined as a vector of weights and Si’s denote disjoint subsets of elements 
of {1,..., h}. Index i represents a location on the pixel grid. The subset Si typically 
gathers 4 elements corresponding to the 2 spatial gradients of the 2 components of 
motion dj at the location indexed by i. For p = 2, these choices are equivalent to 
constraining the spatial gradient of the displacement field d( by a quadratic penal¬ 
ization [^, whereas the case p = 1 corresponds to the weighted total variation (TV) 
approach suggested in mj. 

In summary, (|3.1|)-(3.4) together with the definition of G and TZ specify our 


prior/observation model. In the next section, we will present a low-complexity method¬ 
ology exploiting this model to recover the HR sequence from the collected observations 
{yt}t=o- l^ore specifically, we will assume that the unknowns of the problem include 
the HR sequence {xt}^g, the sequential noise {€t}t=i, the displacements {dt}^;^ and 
the decomposition vector c. All the other parameters of the problem will be supposed 
to be known, although they could easily be included as additional unknowns without 
introducing any conceptual problem in the proposed methodology. 


4. The Estimation Procedure. In this section, we expose our methodology 
to estimate the HR sequence by exploiting the model described in section Our ap¬ 
proach is based on the resolution of a constrained optimization problem. We introduce 
the following shorthand notations: x = {x(}^q, e = and d = Our 

SR reconstruction procedure relies on the following constrained optimization problem: 


argmin J'(x, e, d, c) s.t. 

(x.e.d.c) 


Xt — ’P(Xt + l, dt_|_i) + £t+l, 

XT = Dc, 


0 < f < T- 1, 


(4.1) 


where 


J(x,e,d,c) 


c) = 


T 

El 


'H{xt) - YtWl + ai 


T 

El 


■a2^7^(GM0+a3||c| 


t=i 


for some aj > 0, j S {1,2,3}, and p > 0. Let us make a few comments about (4.1). 


The first constraint ensures that the images of the HR sequence verify the sequential 
model (3.2); the second enforces that prior model (3.4) is satisfied. Each term in 
the cost function (x, e, d, c) has a clear physical meaning: the first term penalizes 
the discrepancies between the predicted and the received observations; the second 
penalizes the noise on the sequential model; the third enforces the displacement to 
have some regularity and the last one constrains c to have some desirable properties 
(depending on the choice of p). For example, setting p G [0,1] typically promotes the 
sparsity of c, see [25]. Because sparsity has revealed to be a good prior in a number 


of works, in the following our main objective is to find a solution to (4.1) with p = 1. 

Problem (4.1) involves a huge number of unknowns (namely (4r -|- l)n -I- q vari¬ 
ables if X, e, d, c have to be estimated). Hence, solving (|4.1[) may be critical even for 


reasonable problem sizes: for instance, considering images of n = 2® x 2® pixels, a 
non-redundant dictionary, i.e., q = n and a sequence length T = 2^, we have that 
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the number of variables involved in the optimization problem grows up to roughly 
2^^. Clearly, such a high-dimensional problem can only be addressed by specifically- 
dedicated procedures. In subsection |4.3[ we propose an overall methodology to solve 
(4.1) efficiently with p = 1. Our approach is based on the combination of several 
modern optimization tools, described in subsections |4.1| and |4.2| More specifically, 
the building blocks presented in subsections |4.1| and |4.2| tackle simplified versions of 


problem (4.1), which appear as intermediate steps in the overall procedure described 
in section j4.3[ We briefly comment on these intermediate problems in the next para¬ 
graphs. 


In section 4.1 we consider the case where p = 2 in (4.1), that is all the functions 


are differentiable. In such a case, we show that the gradient of the cost function 


associated to an (equivalent) unconstrained version of (4.1) can be evaluated efficiently 


by resorting to optimal control techniques [5] . More specifically, we emphasize that the 
complexity associated to the evaluation of the gradient of the cost function remains 
linear in the problem dimensions, for many setups of practical interest. 

In section |4.2[ we focus on the case where d is known but p = 1. The corre¬ 
sponding optimization problem is then convex but not differentiable. Building on 


our derivations in section 4.1 we emphasize that this type of problem can be nicely 
addressed by resorting to the so-called “Alternating Direction of Multipliers Method” 
(ADMM) [7], a modern optimization technique proposed to handle large-scale non- 
differentiable optimization problems. 


Finally, in section 4.3 we consider the general problem (|4.1[), where 


have to be estimated and p = 1. In this case, (4.1) is non-convex (because the term 


d(+i) appearing in the constraints is bi-polynomial) and non-differentiable. 
In order to address this problem, we resort to an optimization procedure introduced 
by Attouch et al. ma, and particularized in [?a to multi-frame SR. The proce¬ 


dure is iterative and exploits the building blocks derived in sections |4.1| and |4.2| to 
solve intermediate problems. The complexity per iteration is linear in the problem 
dimensions. Moreover, from the arguments exposed in [45j . it can be shown that the 
proposed procedure is convergent to a critical point of the problem. 


4.1. The First Building Block. In this section, we assume that p = 2 (so 
that all the functions appearing in (4.1) are differentiable) and show that an efficient 
resolution of (4.1) via gradient descent algorithms exists. Our approach is based on 


fast gradient evaluation techniques as exposed in [5]. 


In order to present our methodology, we first reformulate (4.1) as an (equivalent) 


unconstrained problem. Notice that, because of the constraints in problem (4.1), any 
xt can be expressed as a deterministic function of c and { In other 

words, there exists a function Q(e, d, c) : x x E'^ —?> such that, 


given e, d and c, x = Q(e, d, c) is the unique vector satisfying the constraints in (4.1). 
As a consequence, (4.1) can also be equivalently expressed as 


arg min J'(e,d,c) 

(£,d,c) 


(4.2) 


















where 


J(e,d,c) =J{y. = Q(e,d,c),e,d,c), 


= ^ l!'H(Qt(e, d, c)) - ytWl + ai XI 




0-2 


X^(G*dO + a3l|c| 




(4.3) 


and Qt(e,d,c) is the restriction of Q(e,d,c) to x^. 

Since p = 2, (4.2) is a smooth unconstrained minimization problem and can thus 
be solved by any procedure belonging to the family of gradient descent algorithms. At 
this point, let us make two remarks: i) J7(e,d,c) has usually an intricate structure 
and its gradient does therefore not have any simple analytical expression; ii) the 
computation of the gradient of (e, d, c) via finite differences is out of reach for the 
considered problem because it would require to evaluate the cost function twice as 
many times as the (huge!) number of variables. 

As a consequence, the main bottleneck for solving ( |4.2[ ) lies in the tractable eval¬ 
uation of the gradient of i7(e, d, c). We emphasize in appendix [A| that the particular 
structure of J{e, d, c) enables the use of a specific methodology with a complexity 
scaling linearly with the problem dimensions. More specifically, let 

5o(xo) = ll'H(xo) - yolli 

0t(xt,et,dO = ||H(xt)-yt||2 + al||et||P + a27^(GMt), for l<t<T-l, 

Qt{xt, eT,dT,c) = |j'H(x7’) — yrll^ + Q^iII^tIIp + 0272.(G*dr) -I- aallcHp. 

(4.4) 

Using the notation QT{eT,dT,c) = Qt{^t = Dc, e^, dy, c), the elements of the 
gradient of (e, d, c) at (e', d', c') can then be evaluated as follows: 


Vd,^(e',d',c')=Vd,7^*(x;,d()C,_i+Vd,0t(x;,e(,d(), 
Ve, J(e',d',c') = 

Ve^(e', d', c') = D*Ct + Vc0t( 4, d^, c'). 


(4.5) 


where the variables x(, e(,d( and c' must satisfy the constraints of problem (4.1), 
that is 


x'rp = Dc', 

Xt = dj^]^) -|- t = T — 1 ,..., 0, 


(4.6) 


and the sequence of “adjoint” variables {Ct}t=o obeys the following recursion: 


r Co = vgo(x;,), 

^ Ct = v^, 7 ^*(x(,d;)Ct_i + Vx,gt(x(,e;,d;), t = i,...,T-i, (4.7) 

[ Ct = Vxr’^*(xT, d'rp)CT-l + VxT0T(Xr, Ct: d^, c'). 


Expressions in (4.5) together with recursions (4.6) and (4.7) provide an efficient way 
to evaluate the gradient of J'(e, d, c). The overall methodology can be understood 
as a 3-step procedure: i) given some values of e',d' and c', evaluate {x(}^q with 
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recursion (4.6); ii) use the value of {xj}^q to evaluate the adjoint variables {Ctl^o 
from (4.7); Hi) compute the gradient of J'(e,d,c) by using (4.5). Note that the 


gradients appearing in the right-hand side of (4.5) and (4.7) typically have simple 


analytical expressions and are thus straightforward to evaluate. 

It is easy to see that the complexity induced by this methodology scales (at worst) 
as 0{v?T + nq) since it only involves matrix-vector multiplications, with matrices of 
dimension n x n or n x q. In practice, this complexity can usually be reduced to 
0{nT -I- q), or simply to 0{nT) in the case of a non-redundant dictionary. This 
linearity in the problem dimensions occurs if the matrices involved in ( |3.4[ ), ( |4.5[ ) and 
(4.7) are typically very sparse and/or rely on fast transforms of linear complexitjQ 
In the (typical) example (3.3) considered in this paper, we clearly obtain this linear 
complexity since |V(x(s) -1- d((s))| <C n. In the rest of the paper, we focus on model 


(3.3) and choose a dictionary so that the complexity related to (4.5)-(4.7) is linear in 


the problem dimensions. 

Before concluding this section, let us open a parenthesis to highlight some connec¬ 
tions with some previous works which considered the “Kalman smoother” update rules 
as the starting point of their video SR method, see [20]. First notice that, assuming 
d is known, (4.2) with p = 2 corresponds to the “Maximum a Posteriori” (MAP) es¬ 
timation problem associated to the following probabilistic (backward) state-evolution 
model: 



.AA(0„,a3-iDD*) 

' A/'(7^(x(_|_i, dt+i), ^I„) 

.AA(H(xt),I™), 


(4.8) 


where v ^ A/'(m, P) indicates that v is distributed according to a multivariate normal 
distribution with mean m and covariance P. 

For such a model, it is well-known that the Kalman smoother can compute ex¬ 


actly the solution of (4.2) in a hnite number of steps, namely one forward and one 
backward recursions, see e.g., [311 Chapter 20]. The Kalman smoother involves the 
update of a length-n mean vector and an n x n covariance matrix at each step of the 
two recursions; moreover, the evaluation of these quantities requires the inversion of 
a. n X n matrix. Hence, the Kalman smoother exhibits a computational complexity 
scaling a^O{n^T). Since this complexity is prohibitive for most practical setups, 
several approximations of the Kalman updates for video SR have been proposed in 
[2()| . On the other hand, the procedure described in this section provides an alter¬ 


native, approximation-free, solution to the MAP problem. Indeed, since (4.2) with 


p = 2 is a differentiable problem, it can be solved with a simple gradient descent 
method. More specifically, we can apply the methodology described in this section 
to efficiently compute the gradient of 1 /( 6 , d,c) with respect to e and c (using the 


two last rows of (4.5)). The complexity of this method then only scales as 0{nT) 


per iteration. Moreover, because J'(e,d,c) is strictly convex in (e,c), this type of 
algorithm is ensured to converge to the global minimum of the problem. Hence, if the 
descent algorithm has converged (close) to the minimum after a reasonable number 


^This is the case for any non-redundant wavelet basis, which will induce an overall complexity 
of 0{nT). Fast transforms for sparse redundant dictionaries such as curvelets frames also exist 
but imply a slight complexity overload since the matrix-vector multiplication scales in this case as 
O(nlogn), yielding an overall complexity of 0{n(T -h logn)). 

^We note that the complexity can be reduced to 0{T{m^ mn)) by using some computational 
tricks as the well-known Woodbury matrix identity, see e.g.^ [361 Lemma 4.1]. However, the latter 
still remain too costly for typical problem sizes. 
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of iterations, the proposed methodology drastically reduces the complexity necessary 
to obtain the MAP solution as compared to a Kalman smoother. 


4.2. The Second Building Block. In this section, we address problem (4.11 
with p = 1 but assume that d is known (and thus therefore no longer appear as an 
optimization variable in (4.1)). Particularizing (4.1) to these working hypotheses, 
we obtain the following convex but non-differentiable problem: 


argmin J'(x, e,c) s.t. 

(x,e,c) 


Xt ='P(xt+i,dt+i) + et+i, 0<t<r-l 
XT’ = Dc, 


(4.9) 


where 


J(x,e,c) = '^\\'H{Kt) - YtWl + ai'^WetWi + 03 ! 


t=o 


t=i 


This problem is convex but non-differentiable. As previously, the main bottleneck for 
its resolution lies in its high dimensionality. This, in turn, imposes to resort to low- 
complexity optimization procedures. We show hereafter, that a complexity scaling 
linearly with the problem dimensions is possible by using the “Alternating Direction 
Method of Multipliers” (ADMM). ADMM has recently emerged in the optimization 
community to address large-scale optimization problems. Among the particular assets 
of this type of method, let us mention: i) its robustness (the convergence to a global 
minimum is ensured under very mild conditions); ii) its rapid convergence to an 
acceptable accuracy (typically a few tens of iterations is sufficient). We refer the 
reader to appendix [B] for a short description of the ADMM framework. 


In order to derive the ADMM recursions, we first need to reformulate (4.9) in the 
standard form (B.l) in appendix [b} Letting 


D A 


(x,e,c) 


x* — 7^(xt_|_i, dt_|_i) -I- Ct+i, 
xt’= Dc 


0 < t < T-1 


(4.10) 


(4.9) can be reexpressed as: 


arg min E ll'H(xt) - ytWl + ai X! + “3||c||i, 

(x,£,c)Gn 



Here, we have added two new variables to the problem, e and c, which are counter¬ 
balanced W the inclusion of two new constraints. Using the formalism exposed in 
appendix [ b] with Zi = (x, e, c), Z 2 = (e,c). Si = D and S 2 = we obtain the 
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following ADMM recursions: 

(x(fe+i)^e(fe+i)^c('=+i)) = arginin£W(x,e,c) 


s.t. 


Xt — 7 ^(x4_|_i,+ et_|_i, 0 < f < T — 1 

XT’ = Dc, 


where pi,p 3 > 0 and we have introduced the function: 


(4.11) 


^(fe+1) _ 
— 

~(fc-l-l) _ 

argmiuj^ ||et||i-h 
argmiug ||c|li-h 

- -1- 2> 

-c-f 

(4.12) 

= 

„(fc) , Afe-Hi) 


(4.13) 


U« +c('=+i) 



£('=)(x,e,c) = li'^(xt) - 


y*ll2 


t=0 




€ — e 


(k) 


u 


{k)\ 


P3\ 

2 


|c-cW 


U, 


(fc)l 


(4.14) 


Equations (4.11), ( 4.12[) a nd (4.13) correspond respectively to expressions ( B.2[ ), ( |B.3| ) 
and (B.4) in appendix Bl 


Let us make the following remarks about the different steps of the ADMM pro¬ 


cedure. First, problem (4.11) has the same structural form as the problem addressed 
in section 4.1 in particular, all the terms of the cost function appearing in (4.11) are 


differentiable while the set of constraints imposed on x, e, d and c is strictly the same. 
We can thus apply the methodology described in section [DT] to solve this problem via 
a gradient descent algorithm, with a complexity per iteration scaling as 0{nT). Inter¬ 
estingly, let us mention that, under very mild conditions, the convergence of ADMM 


is still guaranteed if the minimizations in (4.11 )-(4.12) are not performed exactly. 


see e.g., m Theorem 8]. This suggests that the number of gradient steps carried 
out to search for the minimum of (4.11) can be rather limited without affecting the 


convergence of the overall ADMM process. 


Second, the optimization problems specified in (4.12) have a very simple analytical 


solution. In fact the right-hand sides of (4.12) correspond to the definition of the 
proximal operator of the £i norm. The latter has been extensively studied in the 
literature (see e.g., [40l section 6.5.2]) and possesses a simple analytical solution based 
on soft-thresholding operators. In particular, we have 


=softfu -Hui^^(i) 

Pi \ 

g(fc-i-i)(j) = softer. (-I-Uc*^(t) ' 


Vi, 


(4.15) 


P3 


where 


softA (a) = 



if a > A, 
if a < —A, 
otherwise. 


(4.16) 


We note that the solution of (4.12) is typically sparse since the soft-thresholding 


operator (4.16) enforces the small coefficients to be equal to zero. Moreover, we see 


from (4.15) that the complexity of this ADMM step clearly scales as 0(nT -j- q). 
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As a conclusion, since the last step (4.13) of the procedure only involves vector 
additions, the particularization of ADMM to our problem leads to an algorithm ex¬ 
hibiting a complexity per iteration scaling linearly in the problem dimensions. 


4.3. The Overall Procedure. Let us now concentrate our attention on our 


target problem, that is (4.1 1 with p = 1, where all the variables x, c, e, d have to be 
estimated. The cost function then contains both non-differentiable and non-convex 
terms. In such a case, ensuring the convergence to a global minimum is usually out 
of reach for any deterministic optimization procedure. In this section, we consider 
an optimization method proposed in mm and particularized to multi-frame super¬ 
resolution problems in |45] . This procedure addresses optimization problems involving 
a cost function satisfying the so-called “Kurdyca-Lojasiewicz” property and is guar¬ 
anteed to convergence to a critical point of the latter under mild conditions. We refer 
the reader to mm for more details about “Kurdyca-Lojasiewicz” functions. Here, we 
just mention that functions made up of the composition of piecewise polynomial func¬ 
tions obey the “Kurdyca-Lojasiewicz” property. Scrutinizing the structure of (4.1) 


and taking (3.3) into account, it is easy to see that our cost function is piecewise 


polynomial; the optimization framework developed in mm therefore applies. 

Our methodology obeys a 2-step recursion which follows the same lines as the 
procedure presented in The building blocks described in subsections 4.1 and 4.2 


are used to provide an efficient implementation of the intermediate problems appearing 
in these two steps. To express the procedure recursions, we focus on the unconstrained 


formulation (4.2) (with p = 1) of our general optimization problem (4.1). The first 
step of the procedure solves the following problem: 


(g(fe-i-i), c(fc+i)) = argmin ^/(e, c) -|- 7 C(e — e^^\c — 

(e,c) 


(4.17) 


where 7 > 0, is the cost function in ( |4.3[ ) with p = 1 and C : x —>■ M+ is a 

non-negative proper lower-semicontinuous convex function such that C{0nT,0q) = 0 . 
It thus consists in minimizing the (penalized) cost function J^{e, d, c) over the subset of 
variables (e, c); the penalizing term C plays the role of a “cost-to-move” function which 
prevents the new iterate (e(^+i), 0 *^^+^)) from differing too much from the previous 
one. In the sequel, we will focus on the following penalizing terir0 

T 

C(e - eW,c - cW) = ^ - e«)||i + ||c - cW||i, (4.18) 

t=i 


where H G is a wavelet basis. The operational meaning of this “cost-to-move” 

function is as follows: the £i norm enforces its argument to be sparse; hence, the 
second term in (4.18) ensures that the number of nonzero coefficients in does 

c«. 


not differ too much from the one in 


while the first term plays the same role 


for the wavelet coefficients of Using this type of “cost-to-move” is not 

mandatory for the convergence of the proposed procedure. However, it has been 
shown empirically in |45) that it is well-suited to avoid some undesirable local minima 
of the cost functiord 


®In theory, the ti-norm should be substituted by a smooth approximation to prove convergence 
towards a critical point of the cost function, as done in m- In practice, we note that this substitution 
does not impact convergence. 

^An intuitive explanation is that the cost-to-move | |4.18| l induces a “coarse-to-grain” refinement 
of the unknowns which is usually beneficial in computer-vision problems, see details in m- 
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In the second step of the recursion, we update the velocity field d as: 


(j(fe+i) = argniinS(d, + a 2 


(4.19) 


t=i 


where iB(d,d^^^) is a quadratic approximatioij^of ;B(d) = d, 

YtWl, that is 


(k) 


B{d, d^'^)) ^ e(d('=)) + Vde*(dW)(d - d('=)) + —||d - d^ll 


> 0. (4.20) 


The choice of is of course not arbitrary and should be made so that the conver¬ 
gence of the procedure is ensured. We elaborate on this point further in this section. 
For now, let us first discuss the practical implementation and complexity of recursions 


(4.17)-(4.19). It should be noticed that the building blocks presented in sections 4.1 


and 4.2 can be exploited to solve efficiently these steps. Indeed, problem (4.17) has 


the same structural form as the one considered in (4.9): the cost function consists in 


a quadratic term plus a set of convex but non-differentiable terms. We can thus use 
the ADMM procedure described in section m to address it. In the same way, we see 
from definition (14.201) that the cost function ( 4.191) is made up of a quadratic term plus 


some non-differentiable function 0:2 7^(G*d(). Hence, the ADMM procedure de¬ 

scribed in section 4.2 can also be applied here to solve (4.19). In comparison to our 


exposition in section [4. 2 [ only the proximal operators of the non-differentiable terms 


will change when ADMM is applied to (4.17) and (4.19). In particular, the computa¬ 


tion of any gradient of the differentiable part of the cost functions can be efficiently 
evaluated via the procedure described in section [4.1| We particularize the expression 


of the proximal operators appearing in the ADMM implementation of (4.17)-(4.19) 
in appendix |C.2[ As previously, it turns out that the implementation of the latter 


only requires a linear complexity. The complexity of each iteration of (4.17), (4.19) 
is thus once again linear. 

To conclude this section, let us discuss the convergence of the proposed proce¬ 
dure. In [ISl Theorem 1], the authors proved that if J{e, d, c) satisfies the “Kurdyca- 
Lojasiewicz” property and the a^^^’s are properly selected, the sequence defined in 
(4.17)-(4.19) is either unbounded or converges to a critical point of 17(6, d,c). A 
procedure to properly select factors is exposed in [33 section 2.3] and is easy to 
implement in practice. Particularized to the setup considered in this paper, this pro¬ 
cedure reads as follows: select = 2®^ with ^ > 0 and with i the smallest positive 
integer such that 

g(d7+i)) - H(d('=)) ~ - d('=)f -f VdH*(dW)(d('=+i) - d^'^)) 


-I- ai 


1 

'^(n{G*d[^^)-n{G*d 


(fc+1) 




(4.21) 


As mentioned at the beginning of the section, the cost function J'{e,d,c) is piece- 
wise polynomial and therefore satisfies the “Kurdyca-Lojasiewicz” property. Hence, 


the sequence defined by (4.17), (4.19) with factor selection (4.21) is either unbounded 


or converges to a critical point of J(e, d, c). Finally, let us note that the boundedness 


®Note that Z3(d) is similar to the first term of the cost function in l|4.3[l. 
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of {(eW,dW,cW)}fc is usually observed in practice or is easy to enforce by adding 
box constraints to the optimization problem. 


5. Experiments. In this section, we provide an experimental validation of the 
SR procedure proposed in section We focus on the problem of recovering a se¬ 
quence of HR natural images from blurry and LR observations. In section |5.1[ we 
provide a precise definition of the model parameters used to run our algorithms. In 
section 5.2 we describe several algorithms of the state of the art which will serve 
as points of comparison with the proposed approach. In sections 5.3 and 5.4 we 


respectively describe the databases and the figures of merit which will be used in our 
experiments. Finally, a discussion of the performance of the proposed SR methodol¬ 
ogy is provided in section |5.5[ 


5.1. Specification of the Model and Algorithm Parameters. We first dis¬ 
cuss the choice of the parameters appearing in the model described in section In 
particular, we specify the definitions of V, D, G and TZ. We then provide some 
details about the parameters used in our algorithm. 

The observation model H is defined as the composition of a low-pass filtering and 
down-sampling operation. The low-pass filter is assumed to model the blurring effect 
induced by the camera transfer function. In our simulations, we use an approximation 
of a Gaussian kernel with a standard deviation equal to 1.12, as proposed in A 
down-sampling factor equal to 2 is considered. 

The operator V is supposed to model a “displaced frame difference” (DFD): 
V is thus defined as in (3.3) with the interpolation functions {V'iliLi equal to bi- 
dimensional cubic cardinal splines |54) . This representation offers a reasonable accu¬ 
racy with a complexity scaling linearly with the image dimension, see appendix |C.1| 
for further details. 

The dictionary D is chosen so that natural images have a sparse representation 
as a combination of a few of its columns. Several choices of such dictionaries have 
been proposed in the literature, see, e.(?., [33 H3]. Hereafter, we consider a dictionary 
made up of discrete real-valued curvelets [12] : curvelets are known to yield sparse 
representations of piece-wise smooth functions. The choice of a curvelet dictionary is 
also motivated by the existence of fast algorithms for the computation of the prod¬ 
uct between D and some vector, see m- this transform is based on a fast Fourier 
transform and its complexitjj^ scales as 0{n\ogn). 

Matrix H appearing in the cost-to-move function in (4.18) is chosen to be a Haar 
wavelet basij^ In practice, we did not observe a significant difference in our results by 
using other types of wavelets; we thus essentially consider Haar wavelets for simplicity 
purposes. 

To complete our discussion, let us elaborate on the choice of G and TZ, charac¬ 
terizing the regularization imposed on the displacement field dj. In our simulations, 
we wish to enforce either a global or a piecewise regularity of the motion. We pro¬ 
ceed as follows. The spatial derivatives of the motion are approximated by a “fi¬ 
nite difference” scheme: each finite difference corresponds to a particular element of 


®As mentioned earlier, a linear complexity can be preserved by using, for example, a wavelet 
basis instead of a curvelet frame. 

^'^We note that evaluation of products H or H* only requires a linear complexity since they can 
be implemented by fast wavelet transforms 1351 Chapter 7]. 
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a\ 

Pi 

012 

P2 

Q3 

P3 

7 

P 


5e-l 

le2 

8e3 

lei 

lei 

le-2 

leO 

leO 

2e2 


Table 5.1 

Algorithm parameter setting. Parameters ai, a 2 and 03 appear in the cost function ( |4.1[ |. 
Parameters 7 and specify the 2 steps | |4.17| l and | |4.19[ | . Parameters pi, p 2 , P 3 and p are 
auxiliary factors used in the ADMM recursions. 


the matrix-vector product G*dt (matrix G thus contains “±1” elements located at 
proper positions). The regularity of the motion field is then enforced by constraining 
the function 7^(G*d() to be small. In our experimentations, we choose TZ to be de¬ 
fined as in (3.51 with a weighting vector w as in m- Further details are provided in 


appendix C.l 


Besides, we notice that, although we have presented our SR procedure in the case 
of a mono-channel image-sequence observation in section its extension to a multi¬ 
channel setting {e.g., when 3-channel color images are available) is straightforward 
and will be considered in our simulations. 

We now specify the choice of the algorithm parameters. As exposed earlier, we 
rely on the recursion (4.17)-(4.19l described in section 4.3 to search for a critical point 
of the cost function in (4.11 with p = 1. Each step of the recursion (4.17l-(4.19) is 
solved via an ADMM procedure. Details on the ADMM steps are given in appendix 
m The ADMM solvers involve minimizations by a gradient descent procedure. In 
our implementation, we choose a quasi-Newton descent method adapted to our high¬ 
dimensional problem, namely a limited-memory Broyden-Fletcher-Goldfarb-Shanno 
(L-BFGS) procedure with a line-search routine based on the strong Wolf conditions 
[HH] . We stop the ADMM recursions after 20 iterations and the global 2-step recursions 
after 20 iterations, since we observed no significant improvements of the results for a 
larger number of iterations. 

The super-resolved images x^’s are initialized by Lanczos interpolation of their 
low-resolution counterparts. Motion fields are initialized with an upscaled optic-flow 
estimate obtained by applying algorithm m on the low-resolution observations. To 
perform a fair comparison with the “multi-frame” SR algorithm of Mitzel et al. [37) 
described in the next section, we also ran our algorithm with an initial motion field 
computed with the optic-flow algorithm m- For both initializations, the upscaling 
from the low-resolution optic-flow estimate to the high-resolution motion field is done 
with a Lanczos interpolation. The values of the other parameters of our algorithm 
are given in table 5.1 These parameters have been tuned experimentally to lead to a 
reasonable tradeoff between visual inspection and error measurements for the data-set 
benchmark presented in section [5.3| 


5.2. Algorithm Benchmark. The assessment of the proposed algorithm relies 
on a comparison with a benchmark of three state-of-the-art methods: 

• the “single-frame” SR algorithm of Peleg et al., 2014 [43], 

• the “kernel-regression” SR algorithm of Takeda et al., 2009 [^ 148) . 

• the “multi-frame” SR algorithm of Mitzel et ai, 2009 [37) . 
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These algorithms are adapted to the super-resolution of videos exhibiting non- 
homogeneous displacements. Moreover, each of these three algorithms is a state-of- 
the-art method representing a class of SR algorithms. The algorithm of Peleg et ai, 
2014 implements a “single-frame” SR method based on a statistical learning pro¬ 
cedure with sparse representations; the algorithm by Takeda et al., 2009 is a SR 
method based on a “multidimensional kernel regression” fitting the low-resolution ob¬ 
servations; the algorithm of Mitzel et ai, 2009 implements a multi-frame SR method 
using a quadratic relaxation scheme for high-accuracy optic-flow estimation |61j . Fi¬ 
nally, we also compare the performance obtained with the proposed method with two 
standard spatial interpolation techniques, namely 

• the basic nearest neighbor upscaling (block interpolation), 

• Lanczos interpolation |52] . 

Note that in order to treat color image sequences, algorithms only supporting gray- 
level images are run independently on the three spectral bands. 


5.3. Data-set Benchmark. We evaluate the performance of the algorithms us¬ 
ing a benchmark of three image sequences: 

• A synthetic sequence from the MPI Sintel data set m- This recent data set, 
which is derived from the open source 3D animated short film, was originally 
created for the evaluation of optical flows. The synthesized image sequences 
are realistic and particularly challenging: on the one hand, displacement fields 
are characterized by large amplitudes, discontinuities, blur or defocus effects; 
on the other hand, the image sequence presents many occlusions, specular 
reflections or atmospheric effects. In our simulations, we focus on a region 
of interest of 436 x 512 pixels and on the 8 first images. The first and last 
images of the “bandage” data-set sequence are displayed in Fig. In the 
following, we will refer to this sequence as data set #1. 

• A real sample of the standard “foreman” videcf^ In our simulations, we 
focus on a region of interest of 256 x 256 pixels and on the 10 first images. 
The first and last images of this data set are displayed in Fig. In the 
following, we will refer to this sequence as data set 4 f2. 

• A real sample of the challenging “football” video^^, which exhibits non- 
homogeneous and large displacements, as well as multiple occlusions. In our 
simulations, we focus on a region of interest of 256 x 256 pixels and on the 10 
first images. The first and last images of this data set are displayed in Fig. 

In the following, we will refer to this sequence as data set #3. 

The images of these sequences are composed of three spectral bands, each one is coded 
in 8 bits. We create the LR images by applying function H on these sequences. This 
function first filters the discrete signal by a Gaussian kernel of standard deviation 
equal to 1.12 and then down-samples the result by a factor of 2, see [9]. 


^^Image sequences are part of the Derf Collection, which can be downloaded at 
https://media.xiph.org/video/derf [Online; posted 15-October-2015] 
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5.4. Evaluation Procedure. The performance of the algorithms is assessed in 
terms of reconstruction of the super-resolved image and estimation of the motion field. 
We describe the figures of merit used in our assessments hereafter. Let {xt}^Q (resp. 

denote the estimated image sequence (resp. displacements) and 
(resp. the corresponding ground truth. Standard criteria [35] to measure 

the image sequence reconstruction accuracy are the peak-to-signal ratio (PSNR) at 
time t: 


PSNR(t) = 201ogio 


llxf"^ -xtib’ 


and the correlation coefficient (CC) at time t: 


CC(t) 


||xf“^-PxJ—Iblixt-Mxtib’ 


where we have denoted the arithmetic mean of vector Xj and x*’'“® by and . 

We evaluate the accuracy of the estimated motion fields with the time-averaged Mean 
End Point Error (MEPE): 


MEPE=^^||dr“^-d*||i, 

i=l 

and the time-averaged Mean Barron Angular Error (MBAE) in degrees |5]: 


1 -f dj™®(s)dt(s) -I- d*™®(s -I- n)dt(s + n) 

(1 -I- dt(s)2 -t- dt(s -I- n)2) (1 -I- d*’’“®(s)2 -|- d*’’"®(s -f n)^) 

where we have adopted the convention that the two n-dimensional components of 
motion have been sorted one after the other in vectors d^ and 

In order to compare the different algorithms (algorithm |33| does not support 
large images and exclude pixels at the image border), the criteria PSNR and CC are 
evaluated on a spatial window of size of 240 x 240 cropped in the image sequences. 

5.5. Results and Discussion. Table presents the accuracy of the different 
algorithms in terms of PSNR and CC. We evaluated these criteria at t = 5 for data 
set #1 and at t = 7 for data sets #2 and #3. We first note that our SR method 
yields better figures of merit than the other methods for the different data sets of the 
benchmark. It improves slightly the CC and substantially the PSNR (more than a 
unit) for each data-set configuration. Second, the estimates released by the proposed 
approach seem to achieve a good quality level irrespective of the considered data set: 
on the contrary, the multi-frame SR algorithm m performs fairly well on data set 
#1 but its performance collapses on data set #3; the kernel-regression SR algorithm 
[HllS] obtains good results for data set #2 while yielding only a slight increase of 
the accuracy with respect to a Lanczos interpolation on data set ^3; the single-frame 
SR algorithm [43] has a good behavior for data set #3 but is less competitive for 
data set ^1. Third, the performance of our algorithm seems to be comparable for 
different motion initializations, in particular for initial motion fields obtained from 
the optic-flow algorithms of m or m- 


^ T n 

MBAE=—y y 

riT 


arcos 


s=l 
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PSNR(f) 

CC(t) 

set ff 1 

set # 2 

set # 3 

set ff 1 

set ff 2 

set ff 3 

t—5 

t=7 


t=5 

t=7 

t=7 

Nearest neighbor 

27.236 

22.639 

22.656 

0.9771 

0.9648 

0.9556 

Lanczos interpolation [52] 

27.845 

24.571 

23.040 

0.9800 

0.9775 

0.9579 

Single-frame SR [43] 

28.359 

32.518 

23.971 

0.9815 

0.9949 

0.9684 

Kernel-regression SR [131 EH] 

28.944 

33.275 

23.394 

0.9838 

0.9957 

0.9643 

Multi-frame SR [37] 

29.935 

32.295 

21.948 

0.9844 

0.9939 

0.9491 

Proposed (optic-flow init. [61]') 

30.634 

35.027 

25.007 

0.9868 

0.9969 

0.9750 

Proposed (optic-flow init. |57|1 

30.790 

34.305 

25.302 

0.9872 

0.9963 

0.9767 


Table 5.2 

Accuracy of super-resolved image estimates in terms of PSNR and CC at time t. 


The improvement brought by the proposed method can also be seen by a visual 
inspection of the reconstructed images in Fig. and We can first underline 

the enhancement provided by the inclusion of some motion information in the SR 
reconstruction process by comparing the estimates released by the “single-frame” and 
the “multi-frame” / “sequential” algorithms. In Fig. and one can notice that 
the estimated contours and the texture are over-smoothed if no motion information 
is included. This is for example visible by inspecting the fuzzy girl’s eyebrow or 
the smoothed scales of the little dragon in Fig. distinguishing the tongue of the 
foreman or analyzing the texture of the grass field of the football game in Fig. In 
comparison, our algorithm enhances the reconstruction accuracy of these details as 
visible in Fig. [^and[^ The drawback of including motion is that, as it can be noticed 
for the little dragon, errors in motion discontinuity estimation may induce imprecision 
on the contours and lead to some undesirable oscillations. 

Although not as accurate as the proposed method, we note the good performance 
of the single-frame SR algorithm proposed in [33] . Clearly, it is competitive with other 
state-of-the-art approaches exploiting motion information. This is probably due to 
the relevance of the sparse prior employed by the single-frame SR algorithm [43] . This 
is particularly striking when the motion in the video is too difficult to exploit by the 
multi-frame or kernel-regression SR algorithms, as shown for the challenging football 
sequence in Fig. and[^ Let us also mention that results obtained with a kernel- 
regression SR strategy reveal a slight enhancement in comparison to standard spatial 
interpolation techniques, which is probably induced by the implicit introduction of 
the motion information via the modeling of the local spatio-temporal structures of 
the sequence. 

Our experiments also emphasize several examples where a sequential SR setup can 
solve some reconstruction ambiguities which can be difficult to treat in a multi-frame 
framework. In Fig. and some erroneous reconstructions, which do not appear 
in the proposed method, can be noticed in the multi-frame estimates: for example, 
artefacts in the girl’s eye in Fig. deformations of the foreman’s tongue and the 
fuzziness of the stripes of the football player’s trousers in Fig. Indeed, matching 
all the images of the sequence with a reference frame is often a more difficult task 
than estimating motions between consecutive frames. In the former situation, motion 
estimation has to deal with large displacements between distant frames whereas, in 
the latter setup, the problem simplifies into the estimation of a succession of small 
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LR estimate 

SR estimate 

MEPE 

MBAE 

MEPE 

MBAE 

Zach et al, 2007 | 61 |. 
Xu et at, 2012 |37|. 

1.319 

1.342 

24.988 

25.592 

1.302 

1.320 

24.975 

25.545 


Table 5.3 

Accuracy of low-resolved or super-resolved optic-flow estimates in terms of MEPE and MBAE, 
with respect to the motion initialization algorithm. 


displacements. In other words, a SR multi-frame setup will try to match images of 
the sequence which could apparently seem independent, with the potential drawback 
of estimating erroneous structures. On the other hand, a SR sequential setup prop¬ 
agates information through consecutive frames and may better succeed in modeling 
the overall dependences in the image sequence. One could nevertheless argue that the 
estimation of inter-frame motions could also lead to error propagation if the motion 
estimates are inaccurate. This is not what we observed in our simulations: motion 
errors are usually absorbed by the error terms (which increase in the region where 
the motion is badly estimated). This is illustrated in Fig. we observe that et may 
be large on the contours of the characters (where the quality of the motion estimation 
is typically low) but the PSNR is nevertheless stable across the reconstructed image 
sequence. Therefore, as observed from our simulations, a sequential SR approach is 
usually better conditioned to deal with videos such as data sets #1 and #3, which 
exhibit large displacements and/or occlusions. 

Finally, let us notice that there is a positive interaction between the estimation 
of the motion fields and the HR images: intuitively, it is clear that a good estimation 
of the HR image sequence will improve the quality of the estimated motion fields; 
similarly, a good estimation of the super-resolved motion helds will enhance the accu¬ 
racy of the estimated image sequence. Although this positive interaction is difficult 
to ensure from a theoretical side, we have often observed it in practice. We illustrate 
in Table |5.3| and Fig. 0 the benefit of refining the motion estimation through our 
iterative procedure for the synthetic data set #1, independently of the initial motion 
estimate. In Table [STSl we can notice a slight gain in terms of MBAE and MEPE in 
comparison to a direct estimation of the motion from the LR observations with the 
methods presented in m or |61] (which serve as initializations for our algorithm, see 
section 5.1). More interestingly, we note in Fig. |^that the motion held released by 
the proposed approach exhibits sharper discontinuities than those output by m or 

ETl. 


6 . Conclusion. We have presented a new methodology to solve video SR prob¬ 
lems, i.e., to reconstruct a HR image sequence from LR observations. The HR se¬ 
quence is entirely described by a parametric non-linear sequential model, which con¬ 
nects the different images of the sequence. It is parametrized by a hnal condition, 
a sequence of non-global displacement helds and a sequence of additive noises. In 
order to compensate for the ill-posedness of the video SR problem, we considered 
priors enforcing some forms of sparsity on the unknown parameters of the system. 
The joint estimation of the hnal condition, the displacement and the noise sequences 
was expressed as a constrained minimization problem which, in the general case, is 
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high-dimensional, non-differentiable and non-convex. We provided elementary build¬ 
ing blocks to tackle each of these difficulties, and, by gathering them, designed a 
convergent optimization algorithm enjoying a complexity (per iteration) linear in the 
problem dimensions. Our numerical simulations on several video benchmarks show 
that the proposed SR method is competitive with state-of the-art. In particular, the 
gain appears to be particularly important for videos involving complex motions with 
large amplitudes and occlusions. 


Acknowledgements. The authors wish to acknowledge C. Deltel and S. Cam¬ 
pion for their technical support in numerical simulations. 


Appendix A. Proof of (4.5)-(4.7). The proof of this specific feacfcward optimal 


control solution follows the sketch of the demonstration for the more standard forward 
problem presented in [6]. We will focus on the following optimization problem: 


argminT(x = Q(e, d, c), e, d, c), 

(e.d.c) 


(A.l) 


where T denotes some objective function to be defined below. We recall that, given 
(e,d, c), the function Q(e,d,c) determines a unique vector x = Q(e,d,c) satisfying 
the constraints in (4.1), see section 4.1 In this appendix, we will use the following 


short-hand notation for the constraints in (4.1): 


xt — Ct+i, d(+i), 

XT = Dc, 


0 < t < T - 1, 


(A.2) 


with J^i(x(+i, ej+i, dt_|_i) = 'P(xt_|_i, dt+i) -|- e^+i. We will also alleviate the notation 
for the constraint x = Q(e, d, c) by denoting this vector simply by x. Therefore, x 
should be understood as a function of e, d, c and no longer as an independent variable. 


The proof of (|4.5|)-(4.7) is made of two different parts, in which we study different 


instances of optimization problem (A.l). In a first step, we will consider an objective 
function only depending on the initial stat 

(A.3) 


r(x,e,d,c) = 5o(xo). 


Then, in a second step, we will come back to the more general problem (4.2), i.e., an 


optimization problem where the objective function T will match the cost function. 


J, given in (4.3): 


T-l 


T(x, e,d,c) = 0o(xo) + ^ Qtixt,£t,dt) + er, dr, c) = J'(x, e,d, c). (A.4) 


with the 
at some 

point in the set 


First Part of the Proof. We begin by considering problem (A.l) 
objective function (A.3). By the chain rule of derivation applied to (A.2 


(x',e',d',c') 


dj+i), 0<t<T 1 

Xt = Dc' 


(A.5) 


mentioned previously, xq must be understood as a function of e, d, c. 
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we can decompose the gradients into the products: 


Ve,r(x', e', d', c') = V,, e;, dO 

Vd,r(x', e', d', c') = Vd, j-;_i(x;, e;, d;) 


•Vx,J-rVx,J-o*Vxo0o(x'o), 

•Vx,J-rVx,J-o*Vx„go(x'), 


Ver(x',6',d',c') •••Vx.J^r Vx^J-Q Vxo0o(x[,), 

(A.6) 

where we recall that Vxj^t-i denotes the Jacobian matrice of J-t-i with respect 
to function x^ evaluated at (X(,ej,dj) and Vxt^t*_i its transpose. We can rewrite 
gradients in (A.6) in order to exhibit their recursive structures. By defining the 


forward recursion 


f Co “ Vxo5o(Xo), 

1 Ct = Vx,j-;_iCt-i, i<i<T, 

we obtain the following rewriting: 

r Ve,r(x',e',d',c') = Ve,j-;_i(x;,e;,d;)c_i, 

Vd,r(x',e',d',cO = Vd,j-;_i(x;,e;,d;)c_i, 

[Vcr(x',e',d',cO=D*CT- 


l<t<T, 

l<t<T, 


(A.7) 


(A.8) 


Second Part of the Proof. We now consider problem (A.l) with objective 
function (A.4). By making a change of variables, we want to obtain a rewriting of 
function (A.4) with a structure analogous to (A.3), so that the gradients are given 
by a recursion of the form (A.7)-(A.8). In other words, by making some change of 
variables we intend to rewrite the sum of functions in (A.4) as a unique function 
depending solely on an “initial state”. In order to do so, let us define variables k^’s 
recursively as follows: 


( kt = 0 , 

< k;t-i(xt, £t, dr, c) = kt + Gt{^t, £t, dr, c), 

[ Kt_i(xt,et,dt,c) = Kt +0t(xt,et,dt), T - 1 > f > 1. 


We then obtain that 


T-l 


k:o(x, e, d, c) = ^ Ct, d*) + Gt{xt, ct, dr, c). 


t=i 


and the objective function T given in (A.4) can be rewritten as 
T(x, e, d, c) = k:o(x, e,d,c) + C/o(xo). 


(A.9) 


Considering the following change of variables it — ( 


, we then have that the right- 


hand side of (A.9) can be rewritten as a function of xq only. In the sequel, we will 
use the following specific notation to emphasize this fact: 


00 (xo) = Ko + 00 (xo). 


(A.IO) 
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Moreover, it is easy to see that functions x^’s satisfy the following backward recursion: 


Xt —-^t(xt+i, dj+i), T — 1 > t > 0, 

xr = Trier, dp, c), 


(A.ll) 


where 


Ct+i, dt_|_i) — 

Trier, dy, c) = 


et+i, dj+i) 

<«t+i + Gt+ii^t+i, et+i, dt+i )J ’ 

Dc \ 

0t(Dc, er, dr, c) J 


We remark that the cost function (A.101, recursion (A.ll I and the set 


(x',e',d',c') 


~d(_(_]^), 0<t<T 1 

Sc'rp = Trie'r, d^, e!) 


(A.12) 


have respectively the same structure as (A.3), (A.2) and (A.5). We can then apply 


the result obtained previously and get the gradients of T using the same reasoning as 


the one made to derive (A.7)-(A.8). More specifically, let (x', e', d', c') be some point 
in (A.12), and let Ct be an “adjoint” variable verifying: 


Co “ Vxo^o(Xo), 


(A.13) 


where the Jacobian matrix of Tt-i evaluated at some point (X(,ef,d() is denoted 
Vx^Tt-i- Using (|A.8), we obtain the following expressions: 


Vd,r(x',e',d',c') = VdjJ';_i(x;,e;,d;)Ct_i, 1 < t < T, 

v,,r(i',e',d',c') = v,,.F;_i(i;,e;,d;)Ct_i, i < t < t, 
Vcr(i', e', d', c') = Ve.F; (6^, d^, c')Ct- 


To finalize the proof, we re-express recursion (A.13) by developing it with respect 

where the Ct’s 


to the two different components of the adjoint variable Ct = 


A I Ct 
wt 


(A.13) by taking (A.10) into account, we obtain 


have the dimension of xt’s and Wt’s are scalars. Particularizing the first equation in 

(A.14) 


Co^ _ ^ MoiK) 


Wo 


V, 

^^oGoi%)J V 1 


Moreover, using the definition of Tt, the second equation in (A.13) leads to 


Ct\ ^ Vx,0t(x'„ei,d;)\ /Ct-i 

wty \ 0 1 J \LUt-i 

Ct\ _ f^xT^r-i e^, dy, c')\ /Ct-i 

ojt / V 0 1 J 1 


1 < t < T - 1, 


(A.15) 


(A.16) 
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Equations (A.14|-(A.16) imply that uJt 
equivalent to (4.7). 


1 Vt; moreover, the recursion in is 


Appendix B. The Alternating Direction Method of Multipliers. The 

alternating direction method of multipliers (ADMM) focusses on the following type 
of optimization problems: 

min Giizi) + 02 (^ 2 ), s.t. Azi+'Bz 2 = Qr, (B.l) 

ziGHi,Z 26 Hi 


where A S lR,’’^”b B S 11’’^"'=, Gi : lR”i —R, ^2 : R are closed, proper 

and convex functions, and 5i, 52 are non-empty convex sets. We note that the 
conditions on Qi and G 2 are pretty mild; in particular, Qi and G 2 are not required to 
be differentiable and can take on infinite values. 

ADMM is an iterative procedure inspired by the well-known method of multipliers 
[^. It searches for a minimizer of (B.l) by sequentially minimizing the correspond¬ 
ing augmented Lagrangian with respect to each primal variables Zi and Z 2 , before 
updating a dual variable u G R'". Formally, the ADMM recursions take the form: 


=argminei(zi) + ^||A;2i+B4'=^+u«||i (B.2) 

ziGHi Z 

=argmin02(^2) + ^||A4'=+'AB;22 + u«||^, (B.3) 

Z2GH2 z 

u(fc+i) ^ u(fc) (B.4) 


for some p > 0. 

ADMM has recently sparked a surge of interest in the signal-processing commu¬ 
nity for several reasons. First, the conditions on Gi and G 2 in (B.l) (i.e., closed, proper 
and convex) are mild and (B.l) therefore encompasses a large number of optimization 
problems as particular cases. Second, the ADMM recursion (B.2)-(B.4) converges to 
a solution of (B.l) under very general conditions, see section 3.2]. Third, although 
ADMM is known to be slow to converge to a solution with high accuracy, it has been 
shown empirically that ADMM converges to modest accuracy in a few tens of itera¬ 
tions. 


Appendix C. Algorithm’s Details. 

C.l. First Building Block: Computation of (4.5)-(4.7). In this appendix, 
we complement the exposition done in section [FT] on the fast evaluation of gradient of 
cost function J^(e,d,c) given in (4.3), particularized to the model parameters speci¬ 
fied in sectionlS.ll First of all, we expose the particularization of recursions (4.5 )-(4.71 


to this setting. It is straightforward to see that it results in the following procedure: 


i) Compute sequence by the backward recursion: 

J Xy = Dc', 

\ ~ ‘it-i-i) + ^t+i- 

ii) Compute sequence {Ctl^o by the forward recursion: 

r Co = 2V^o^*(^(xo)-yo), 

1 = 
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in) Compute the gradients: 

r V,,J(e',d',c') = C-i+2aie;, 

I Vd.^(e', d', c') = EtCi_i + 2a2WiGG*d;, 
[ Ve^(e',d',c') = D*CT + 2a3c' 


where Wj € and S are respectively diagonal and block-diagonal 

matrices which will be defined in the following. We detail hereafter the elements of 
the procedure which have not been fully described yet. 

We begin by making some comments on the evaluation of the warping function 
7^(X(,dJ) and its Jacobian Vxj7^(xj, dj), which constitute the core of the recursion. 
We propose to use the family of bi-dimensional cubic cardinal splines {'i/'i}r=i 
representation (3.3). In practice, we compute an equivalent representation based on 


the family of bi-dimensional cubic B-splines functions Indeed, this repre¬ 

sentation presents some computational advantages because of the existence of fast 
B-splines transforms. The relation between cardinal cubic splines and cubic B-splines 
functions is given in [54]. This reference also provides details on the fast cubic B- 
splines transform by recursive filtering. Let matrix C* = [ci, ...,c„]* G R"^" denote 
the direct B-spline transform of a discrete bi-dimensional signal, i.e., the transform 
computing from a discrete signal Xt its representation with spline coefficients G*Xt. 


Rewritten (3.3) with cubic B-spline functions, we get: 


P,(xt,dt)= ( 

iGi5(x(s)-l-dt(s)) 


'xt (/>i(x(s) +dt(s)). 


(C.l) 


where ■d(x(s)-|-dt(s)) denotes a subset of vector indices corresponding to the neighbor¬ 
hood of the spatial position x(s) (which differs from the subset V previously defined 
in (3.3)). To simplify notations, we denote by I: R" x R^” —>■ R” the function taking 
as a first argument spline coefficients C*X( and as a second argument a motion field 
dt, and whose s-th component is given by (C.l). Using this notation, (C.l) can be 
rewritten in the vectorial form 


P(xt,dt)=Z(C*Xt,dt). 

We denote by VI(C*X(,dt) the Jacobian of function Z at point (C*X(,dt) with 
respect to its first argument, he., spline coefficients. Since function Z is linear with 
respect to spline coefficients, the Jacobian is only dependent on the value of its second 
argument, he., dj. Therefore, we will adopt the notation VZ(dt) in the sequel. 

The complexity of evaluating both spline coefficients C*Xt and the interpolated 
function Z, scales linearly with the image dimension, he., 0{n), thanks to the rep¬ 
resentation separability and to recursive linear filtering [54] . Multiplication with the 
Jacobian transpose 

Vx,Z*(xt+i,dt+i) = CVZ*(d*+i), 

implies also a linear complexity: first, matrix C is symmetricp^so that it is identical to 
the direct B-spline transformation C*, computed by recursive linear filtering; second, 
the multiplication of the Jacobian transpose of function Z with vector is equal to 

VZ*(d,)Ct(s)= E Ci(*)</>«(x(*)+dt(*)). 

z|sGi3(x(i)+dt(d) 


^^Matrix C is symmetric in the case of periodic boundary conditions m- 
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Concerning the Jacobian transpose it is easy to see that this matrix is an 

up-sampling operation, inserting zeros, followed by the same low-pass filtering as in 

n. 

We continue by detailing matrices appearing in the last step of the procedure. 
First, we note that matrix D* is simply the direct Real-valued Fast Curvelet Trans¬ 
form. This transform is, as well as its transpose D, based on fast Fourier transforms, 
whose complexity scales in O(nlogn) [35]. Next, the two diagonals of the two-block 
matrix Ej are the two n-dimensional vectors (I(C*Xt,d()) for j = 1 , 2 , where 
Sj denotes the j-th spatial coordinate. We approach these partial derivatives by 
second-order centered finite differences. Then, the diagonal of matrix W( is the vec¬ 
tor concatening twice the weight vector wj, i.e., Wt{s,s) = W((2s, 2s) = Wt(s) for 
s = 1 ,..., n. 

To finalize the description of this procedure, it remains to give some details on 
matrix G. Let the elements of vector G*dt be first-order forward finite-difference ap¬ 
proximations of the spatial gradients of the two motion components, which have been 
rearranged beforehand on the pixel grid. This gradient approximation becomes exact 
assuming that components of vector dt are coefficients associated to the decomposi¬ 
tion of some continuous motion held in a basis of interpolating and separable scaling 
functions (see a proof in [32]). Straightforward calculus then shows that elements 
of vector GG*dt are second-order hnite difference approximations of the Laplacian 
of the two motion components, which have been rearranged beforehand on the pixel 
grid. 


C.2. Second Building Block: ADMM Solver for Problems (4.17) and 


(4.19). In this appendix, we present an ADMM implementation of the two minimiza¬ 


tion problems (4.17) and (4.19) appearing in the procedure described in section 4.3 
(which also corresponds to Algorithm 4 introduced later on in section]^. In the fol¬ 


lowing, iterations of the 2-step recursion presented in section [4.3] will be indexed by 
the exponent in order to differentiate them from the iterations related to ADMM, 
which will be indexed by the exponent . 


We begin by the analysis of minimization problem (4.17). 
equivalently reexpressed as: 


This problem can be 


T T 

argmin ^ ll'H(xt) - yt||^ ^ (ai||et||i -h 7|| Vlli) + a 3 ||c||i -h 7 ||Jc||i 

(x,e.c)Gn,(g,c)“ ^ 

{ £* = £*, Vt, 

H*(e*-ef^)=V, Vt, 

c = c, 

C - c(^) = Sc, 

where 

Xt ='P(xt+i, dj_|_i)-I-0<t<r—ll 

XT = Dc J 

Here, we have added four new variables to the problem, e = (ci,..., Ct’), c, 
Se = {Se^, ■■■,Scj,) and Sc, which are counterbalanced by the inclusion of four new 
constraints. We use the formalism exposed in appendix [^ with Z\ = (x, e,c). 










26 


Z2 = (e, c, Ig, 5c), = ri and ^2 = x R'^ x R"^ x R'^ and obtain the following 

ADMM recursions: 


(x 


(fc+l) g(fe+l) j.(/c+l)'( ^ 


~(fc+l) _ 
— 

g(fc+l) _ 
J(fc+1) _ 


(fe+1) 
So 


u 


) = argmin C^'^\^,e,c) + ^ ^ ||H*(e* - - 5« + u,, 

{x,e,c)Gil 


(fc )||2 




+ ^||c-cW-5W+uW||2, 


arg mirig^ 
arg min^ 
arg min^ 
arg min^ 


\^t\\i -t 

lieilid 

ll^eJIl 


Pi 

2ai 


^(fe+i) 


~ 


_P3_llr('=+l) - , 


dflli, 

,( fe )||2 


2 q :3 II 


TT^/ (fc+1) ^(^)'\ X in^^^||2 

^ K^t -^tJ- II2. 


||5c|li + |^||c('=+i)-cW-5c + u 


(fe)l 


- Lie 


u(fc) 

“et 

(fc) 


(fc+1) ~(k+l) 

~ > 
g(fc+l) _ 

(k+1) 


+ H*(ef- ef^) - 

ik] ^ ^(fc+i) _ _ ^(fc+i) 


(C.2) 


(C.3) 


(C.4) 


where is defined in (4.14). Equatio ns (|C.2[ ), ( |C.3[ ) and ( |C.4| ) correspond re¬ 
spectively to expressions (B.2), (B.3) and (B.4| in appendixWe comment on the 
two first steps of the ADMM algorithm, the last one being trivial. First, as already 
mentioned, problem (C.21 has the same structural form as the problem addressed in 
section 14.11 


We thus apply the methodology described in section |4.1| to solve this 
problem via a gradient descent algorithm. The core of this methodology is the com¬ 
putation of the gradient of the cost function with respect to c and e. The gradient 
efficient evaluation relies on a backward-forward recursion possessing the structural 


form of the first building block constituted by equations (4.5)-(4.7). Some details of 
the implementation on (4.5)-(4.7) are provided in appendix C.l for the particular 


case of the model parameters given in section [5T] 

We remark that the complexity associated to the evaluation of the gradient scales 
as 0{nT q). Second, the optimization problems specified in (C.3) all have simple 


analytical solutions based on soft-thresholding operators (4.16). We immediately 


remark that the two first updates in (C.3) are identical to the ADMM steps (4.12) 
used to treat the convex case in section \'iJ2\ Moreover, the solutions to the last two 
problems in (C.3) are given by 


= soft 2 




= soft2 -I- 


Vi, 


(C.5) 


where h, is the ith column of H and “soft” denotes the soft-threshloding operator 


defined in (4.16). 


We continue with the analysis of minimization problem (4.19). We first remark 


that we can apply the methodology described in section to compute the gradient 
VdtR(d(^)) required to build the quadratic approximation (4.20). Once this quadratic 
approximation has been obtained, the task now is to solve minimization problem 
( 4.19[ ). We can notice that this problem does unfortunately not possess an explicit 
solution. To circumvent this issue, we use an ADMM strategy, as detailed below. 
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Problem (4.19) is reexpressed as: 


argmin + 0!2 EVdt) 

d,(di,....dr) t=l 

s.t. G*dt = dt, yt. 


(C.6) 


Here, we have added the new variables dj’s to the problem which are counterbalanced 
by the inclusion of new constraints. We use the formalism exposed in appendix 
with 21 = (di,...,dT), Z 2 — (di,...,d7’), Si = and S 2 = and obtain the 

following ADMM recursions: 


= argminR(d,d(^)) + y E - dj'"’ + 


l(^) 






j(fc+i) _ ai-gmin 7^(d() 


^1 

2a2 


|G*d 


(fc+i) 


- dt 


u 


(fc )||2 
dt Il2) 


(fc+1) (fc) 

"dt =< 


G*d 


(^ + 1) ^(^+1) 


(C.7) 

(C.8) 

(C.9) 


Equations and ( |C.9[ ) correspond respectively to expressions (Q, dBRt 

and (B.4) in appendix [b| 

We comment now on the resolution of (C.7) and (C.8). First, the unconstrained 
differentiable problem ( |C.7[ ) can be easily solved via a gradient descent algorithm. 
The gradient of the cost function in (C.7) with respect to dj can be expressed as 

VdtR(d(^)) + a(^)(dt - d^) + p2G(GMt - df ^ + u^). (C.IO) 

As mentioned previously, Vdt^(d^^^) is simple to evaluate via the recursions described 



It is given for any j € Si with 1 < i < n by 


0 if r* < a2w(i)/p2, 

(jlK+rjQ-) ^ ^ Q,2w(i)/p2) ^„*j(fc+l) 


otherwise. 


where the scalar is given by 


j&Si 






U 
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Fig. 1. Data set #1: first and last frames of the ^‘bandage” sequence. 


Fig. 2. Data set #2 : first and last frames of the ‘foreman” sequence. 





Fig. 3. Data set #3: first and last frames of the ‘football” sequences. 
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Fig. 4. “Single-image” SR estimates for data set #1. Details of the SR images obtained 
with nearest neighbor strategy (1-st row), Lanczos interpolation (2-nd row) and the learning algo¬ 
rithm proposed in (S-rd row). 
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Fig. 5. Multi-frame and sequential SR estimates for data set #1. Details of the SR images 
obtained with the multi-frames algorithms of UBEir (1-st row) or \37j (2-nd row), and with the 
proposed sequential algorithm (3-rd row) in comparison to ground truth (A-th row). Initialization 
of our algorithm relies on the optic-flow method m 


































34 



Fig. 6. “Single-image” SR estimates for data set #2 and #3. SR images and details 
obtained with nearest neighbor strategy (1-st row), Lanczos interpolation (2-nd row) and the learning 
algorithm of \43f (3-rd row). 
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Fig. 7. Multi-frame or sequential SR estimates for data sets #2 and #3. SR images and 
details obtained with the multi-frame algorithms of \49\ \4^ (1-st row) or (2-nd row), and with 
the proposed sequential algorithm (3-rd row) in comparison to ground truth (4.-th row). Initialization 
of our algorithm relies on the optic-flow method m- 
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Fig. 8. Optic-flow SR estimates. Motion field estimated from low-resolved images of data 
set #1 at initial time. Top: estimates for state-of-the-art algorithms (left) and m (right). 
Middle: estimates with the proposed SR algorithm initialized with (left) or ISIS (right). Bottom: 
ground truth and associated colormap. 
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Fig. 9. Reconstruction of three super-resolved images xt (3-rd row), optic-flow fields dt (1-st 
row) and warping errors et (2-nd row) for data set #3 corresponding to t = 3 (left) t = 5 (middle) 
and t = 7 (right). True images (A-th row) and associated PSNR (computed without quantification of 
the estimates and including the image borders) are displayed below. Initialization of our algorithm 
relies on the optic-flow method \57j. 















